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Abstract 

/nfrat/i/cf/on; Spermatogenesis is a highly complex process involving several thousand genes, only a minority of which have 
been studied in infertile men. In a previous study, we identified a number of Copy Number Variants (CNVs) by high- 
resolution array-Comparative Genomic Hybridization (a-CGH) analysis of the X chromosome, including 16 patient-specific X 
chromosome-linked gains. Of these, five gains (DUPIA, DUP5, DUP20, DUP26 and DUP40) were selected for further analysis 
to evaluate their clinical significance. 

Materia/s and Methods: The copy number state of the five selected loci was analyzed by quantitative-PCR on a total of 276 
idiopathic infertile patients and 327 controls in a conventional case-control setting (199 subjects belonged to the previous 
a-CGH study). For one interesting locus (intersecting DUPIA) additional 338 subjects were analyzed. 

Results and Discussion: All gains were confirmed as patient-specific and the difference in duplication load between 
patients and controls is significant (p = 1.65x 10~''). Two of the CNVs are private variants, whereas 3 are found recurrently in 
patients and none of the controls. These CNVs include, or are in close proximity to, genes with testis-specific expression. 
DUPIA, mapping to the PARI, is found at the highest frequency (1.4%) that was significantly different from controls (0%) 
(p = 0.047 after Bonferroni correction). Two mechanisms are proposed by which DUPIA may cause spermatogenic failure: i) 
by affecting the correct regulation of a gene with potential role in spermatogenesis; ii) by disturbing recombination 
between PARI regions during meiosis. This study allowed the identification of novel spermatogenesis candidate genes 
linked to the 5 CNVs and the discovery of the first recurrent, X-linked gain with potential clinical relevance. 
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introduction 

Infertility is a multi-factorial disorder affecting approximately 
15% of couples - half of these can be attributed to the male. 
Currently known causes of male-factor infertility account for only 
60% of cases and known genetic factors contribute to about 15% 
of severe male factor infertility [1]. The most frequent molecular 
genetic cause is related to the Y chromosome and concerns the 
AZF deletions [2]. These deletions are the first example in 
andrology of functionally-relevant CNVs and can be easily studied 
with plus/minus PCR. Recently, the development of high- 
throughput analytical techniques such as a-CGH have allowed 
the screening of large numbers of loci and have been used with the 
principal aim of identifying novel spermatogenesis candidate 
genes. These studies have also been useful in identifying a CNV 
burden in infertile men, mainly involving the sex chromosomes 
[3-5]. 



Considering the high complexity of spermatogenesis, which 
requires more than 2,000 genes, it is highly likely that a proportion 
of the 40% 'missing' aetiology is linked to yet unknown genetic 
factors [1]. CNVs may induce a pathogenic effect in a number of 
ways: structural changes to regulatory regions or a numerical 
increase or decrease in protein-coding regions may have a direct 
effect on mRNA levels [6]; large-scale CNVs may cause changes 
to the well-regulated 3D structure formed by chromatin [7], 
leading to downstream effects on the regulation of protein-coding 
regions. Finally, large CNVs may also disturb chromosome pairing 
at the PAR regions during meiosis [8,9]. 

While the AZF region-linked genes have been extensively 
studied in respect to male infertility [10] very few studies have 
focussed on the X chromosome, despite its predicted enrichment 
in genes expressed in the testis [11,12]. Only a single X-linked 
gene has been shown to defmitively contribute to an infertility 
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phenotype, the androgen receptor (AR) [1], leaving huge scope for 
further experiments on this chromosome. Recently, evolutionary 
models demonstrated that the X chromosome progressively 
accumulating testis-expressed genes [11]. Accordingly, 18% of 
the 800 protein-coding genes on the human X chromosome do not 
show orthologs in other placental mammals. Of these, the majority 
have been acquired by the human X chromosome de novo or 
through transposition from the autosomes. Two-thirds of these are 
ampliconic, possessing duplicated 10 Kb segments with >99% 
homology and are predicted to be involved in male fitness [12]. 

In our previous study [3] , we analyzed the CNV status of 96 
infertile patients and 103 controls using a custom-designed 
8x60 K microarray targeting the X chromosome (Agilent 
Technologies, Santa Clara, CA, USA). Of the 44 gains identified, 
16 were patient-specific and the five most promising CNVs 
(DUPIA, DUP5, DUP20, DUP26 and DUP40) were selected for 
testing in an enlarged study population. 

Materials and Methods 

The local Ethics Committees of the University Hospital Careggi 
and the Fundacio Puigvert approved the study and consent 
procedure. All participants gave written, informed consent. The 
consent forms are stored locally at the University Hospital 
Careggi. All data were analysed anonymously. 

CNV Selection and Bioinformatic Analysis 

Five CNVs (DUPIA, DUP5, DUP20, DUP26 and DUP40) 
were selected from the 16 patient-specific gains identified in our 
previous study [3]. CNVs underwent several selection steps. 
Initially, this focused on the frequency at which CNVs were 
identified in the a-CGH study. CNVs found in control samples 
were excluded. Online data-sources, such as OMIM, gene 
ontology terms, and literature review were used to find candidate 
features within 0.5 Mb of the CNV minimum. Information about 
expression data was obtained from microarray and RNA-seq 
experiments deposited in the GermOnline database [13,14]. 
CNVs containing genomic features with a potential involvement 
in spermatogenesis were selected. At this regard, DUPIA resulted 
of particular interest and therefore it was subjected to a deepened 
in\ I'stigation. All genomic loci, in the previous study reported in 
Hgl8 gc;nome assembly [3], have been converted to Hgl9 
assembly and all loci in the current study are reported according 
to Hgl9. 

Study Population 

Patients with non-Italian or non-Spanish origin were excluded. 
Patients were recruited after a comprehensive clinical examination 
in both centres using the same chnical protocol, according to 
WHO guidelines [15]. Exclusion criteria included all known 
causes of infertility: history of mono- or bilateral cryptorchidism, 
varicocele (grades 2 and 3), obstructive azoospermia, recurrent 
infections, iatrogmiic infertility, hypogonadotrophic hypogonad- 
ism, karyotype anomaly and Y-chromosome microdeletions. 
Control samples were recruited from pre-vasectomy patients with 
proven fertility, male partners of infertile women (tubal factor, 
anovulation and endometriosis) and male volunteers from the 
general population. Given that the aim of the study was to define 
the effect of CNVs on spermatogenesis, the control group 
contained only subjects presenting with normal sperm parameters 
(according to the WHO manual). Additional information about 
sperm parameters is provided in the supplementary material and 
methods 81 in File 81. 



In addition to the previous study population of 199 samples (96 
patients and 103 controls) [3], 404 new subjects (180 patients and 
224 controls) were analyzed at the 5 selected loci. The total patient 

group (n = 276) used in this study consisted of 96 azoospermic, 83 
cryptozoospermic (<1.0xlO'^ spermatozoa. ml~') and 97 severe 
oligozoospermic subjects (between l.OxlO'' and S.Oxlo'' sperma- 
tozoa, ml ') of which 130 were Spanish and 146 Italian. The 
control group (n = 327) consisted of 107 Spanish and 220 Italian 
normozoospermic subjects. For DUPIA only, the further enlarge- 
ment of the study population with additional 158 patients (27 
azoospermic, 59 cryptozoospermic and 72 severe oligozoospermic) 
and 180 controls ensured the analysis of a total of 941 subjects. 

qPCR Analysis 

The [:opy number state of each locus was determined by qPCR. 
Primer sequences arc reported in Table S2 in File SI. Before 
testing, all samples were put through rigorous quality control to 
ensure that DNA quality and concentration was sufficient. DNA 
samples that showed contamination were re-precipitated, using 
ethanol precipitation. DNA samples were diluted to 50 ng.^lL ' in 
ddH20. Each sample was analyzed in triplicate in a 96-well plate. 
The SYBR Select Master Mix produced by Invitrogen (REF: 
4472908) was used. For each sample, PMP22 was amplified as a 
reference gene for analysis purposes. The reaction conditions were 
as follows: 20 ng DNA; 200 nM Primer (forward and reverse); 
SYBR Green SELECT Master mix (Ix concentration) in a total 
reaction volume of 20 |J,L. In the case of DUPIA, an additional 
qPCR was performed in order to test the LINC0()685/PPP2R3B 
gene dosage ratio. For this purpose, we tested the PPP2R3B copy 
number state in DUPIA carriers using a pair of primers solely 
mapping to the PPP2R3B gene. We used the same reaction 
conditions, but optimal concentration for these primers was 
400 nM (forward and reverse). qPCR was performed on TaqMan 
7900 HT on 'Absolute Quantification', using the pre-set 'Stan- 
dard' cycle conditions. The annealing temperature was 60°C. A 
non-targeting control and a 'duplicated' control (a Klinefelter 47, 
XXY man) was included on each plate. For DuplA, we 
simultaneously analysed sample 08-373 (carrying DUP4A) and 
A800 (carrying DUPIA) for whom CNVs had previously been 
identified by a-CGH (3). Threshold cycle and baseline were 
calculated automatically and relative quantification was deter- 
mined using the AACt analysis method. 

AU samples from the original a-CGH experiment (96 patients 
and 103 controls) were re-tested using the qPCR method 
validating a-CGH results. A single 'normal' result (absence of 
duplication in the tripUcate) was required to denote a normal gene 
copy number. In case duphcation was found, the analysis was 
repeated once for confirmation. A 'borderline' range was also 
possible and samples within this range were repeated; if a 
borderhne result turned into duplication, another experiment 
was performed for confirmation. 

Statistical Analysis 

SPSS software (version 20.0, Chicago, IL, USA) was used. CNV 

frequencies were analysed for significance using Fisher's Exact 
Test in the 'R' software package and corrected using Bonferroni- 
Holm step-down correction for multiple testing. 

Results 

Case-control Association Study 

The 5 selected CNVs (DUPIA, DUP5, DUP20, DUP26 and 
DUP40) were confirmed as patient-specific. Among the 276 
patients analyzed, DUP5 was found in two (0.72%), DUP20 and 
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DUP26 in one (0.36%), DUP40 in 3 (1.09%) and DUPIA in 4 
patients (1.44%). Given the higher frequency of DUPIA, 
additional samples were tested for this CNV. In this enlarged 
study population, we found DUPIA in a further two patients (2/ 
158; 1.26%) and in none of the additional 180 controls. 
Considering the total study population used for DUPIA only 
(434 patients and 507 controls), the difference in duplication 
frequency between patients and controls for this CNV reached 
statistical significance after Bonferroni-Holms correction for 
multiple testing (6/ 434 patients versus 0/507 controls; p = 0.047) 
(Table 1). DUPIA carriers displayed a heterogeneous semen 
phenotype ranging from azoospermia to severe oligozoospermia 
(Table 2). A phenotypic description of carriers of all CNVs is 
shown in Table 2. 

A comparison of sperm parameters between all CNV carriers 
and non-carriers was performed and is shown in Table S 1 in File 
SI. The most relevant results concern the Total Motile Sperm 
count, which resulted significantly lower in carriers of DUPIA 
(p = 0.008), DUP26 (p = 0.003) and DUP40 (p = 0.03) compared 
to non-carriers. 

Bioinformatic Analysis for Physical Characterization of 
CNVs 

The physical characteristics of the selected CNVs are shown in 
Table 3. DUPIA and DUP5 were of special interest because of 
their large size and location on the PARI. Locations of DUP20, 
DUP26 and DUP40 are Xp22.2, Xp21.1 and Xq21.1, respec- 
tively. No homology of sufficient size for non-allelic homologous 
recombination (NAHR) was found between the upper and lower 
boundaries of all CNVs indicating that a mechanism other than 
NAHR might have led to the formation of these recurrent gains. 
We checked for other types of repeated elements that could 
contribute to genome instability, such as Alu elements, and found 
that, for DUPIA, DUP5, DUP20 and DUP26, the flanking 
regions are filled with short interspersed nuclear elements (SINEs), 
which include Alus, and long interspersed nuclear elements 
(LINEs). As for DUP40, two LINE elements, belonging to the 
L1PA3 family 1, are located at the extremities of the CNV. Hence, 
we propose that the presence of SINEs and LINEs might underlie 
the generation of recurrent duplications. 

Gene Content of Interest and Search in the Database of 
Genomic Variants (DGV) 

DUPIA fuUy duplicates the foUowing genes: PLCXDl, GTFB6 
and LINC00685 (Table 3). Apparently, the most interesting is a 
long non-coding RNA, (LLNC0068S] predicted to act as a negative 
regulator of a gene [PPP2R3B} with a potential role in 
spermatogenesis (Figure 1). According to the a-CGH data, the 
PPP2R3B gene is affected by DUPIA only for an 1 1.7 Kb span. 



thus it is not duplicated but disrupted at intron 7-8 by the CNV 
(considering the minimum size). Similarly, qPCR data confirm the 
lack of complete duplication of this gene in all CNV carriers. 

Through DGV search within DUPIA, several CNVs (30 
duplications and 25 deletions referring to the variant esv27600) 
were found intersecting the PLCXDl and GTPBP6 loci (located in 
the proximal part of the duplication), and all of them were 
identified only in females without further notification on the 
phenotype. Four variants are reported in close proximity to the 
most interesting genes, LLNC00685 and PPP2R3B. As for 
LIMC00685, two distal variants are reported in DGV (4 Kb 
downstream), esv2219721 and esv266687: the former refers to a 
549 bp deletion found in only one man, whereas the latter refers to 
a 560 bp loss found in 82/1151 (7.1%) subjects (including men 
and women) analyzed. As for the PPP2R3B gene, three variants 
are reported in DGV and none of them affects the entire gene. 
The first (esv27923) refers to a 548 bp loss mapping to intron 10- 
1 1 of the gene found in one woman over a total of 45 1 subjects 
(0.22%); the second (esv25834) refers to a 2. 12 Kb region mapping 
to intron 7-8 of the gene, where both gains (n = 7) and losses 
(n = 4) were found in 1 1 women over a total of 45 1 subjects 
(2.24%); the third (esv2758560) refers to 3 gains (218.7 Kb) 
mapping to intron 1-2 of the gene, found in three men over 27 1 
subjects totally analysed (1.1%). AH these CNVs seemingly do not 
disturb the ratio between LLNC00685 and PPP2R3B copy number, 
as with DUPIA. 

Within the CNV minimum of DUP5 there is a single predicted 
microRNA element, AL732314.1 (Table 3). For DUP5, the 
already described esv2758560 variant is reported in DGV as well 
as a large number of small CNVs; one variant (nsv508745), 
describing three insertions (one in a woman and the other two in 
one man out of 270 belonging to the HapMap project), overlaps 
with the AL732314.1. No information about fertility of the single 
male carrier could be found. 

Within the CNV minimum of DUP20, two protein-coding 
genes are duplicated entirely (Table 3). Of interest is MSL3, an 
homologue of the Drosophila melanogaster homonymous gene, which 
is involved in X-chromosome dosage compensation [16]. A 
number of CNV annotations can be found in DGV. Of the 3 
merged variants (nsv523223: 2.4 Kb; nsv524154: 74.9 Kb; 
nsv526302: 140.8 Kb) representing 3 gains spanning the MSL3 
locus, no CNVs were found in males. The ten (nsv515163, 
esv2739963, esv270995, nsv6797, esv271693, nsv510814, 
nsv6799, esv2739964, nsv436626, nsv499329) common CNVs 
found upstream oi MSL3 axe all found at a much higher frequency 
in females than in males (28 women (1.66%) versus 10 men (0.59%) 
over a total of 1,682 subjects reported in the DGV). 

Both DUP26 and DUP40 do not contain any protein-coding 
genes but of interest is the presence of FAM47C, located 200 Kb 
downstream of DUP26, and HMGN5, located 139 Kb upstream of 



Table 1. CNV frequency and statistical analysis of case-control association study. 





CNV 


Frequency Patients 


Frequency Controls 


Raw p-value 


Corrected p-value^ 


DUPIA 


6/434 (1.38%) 


0/507 


0.01 


0.047 


DUP5 


2/276 (0.72%) 


0/327 


0.21 


0.63 


DUP20 


1/276 (0.36%) 


0/327 


0.45 


0.45 


DUP26 


1/276 (0.36%) 


0/327 


0.45 


0.45 


DUP40 


3/276 (1.09%) 


0/327 


0.097 


0.39 



^Corrected using Bonferroni-Holm Step-down correction for multiple testing. 
doi:l 0.1 371 /journal.pone.0097746.t001 
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Table 3. Physical Characteristics of CNVs selected for the study. 





CNV 


Start-end position 
(CNV Min)' 


Size 
(Min) 


Size 
(Max) 


Protein Coding within, or 
Substrate nearby CNV minimum 
for NAHR^ (witliin 0.5 Mb) 


Regulatory/RNA within, or nearby CNV 
minimum (within 0.5 Mb) 



DUPIA ChrX: 61544-306372' 245 Kb 247 Kb No PLCXD1 , GTPBP6, PPP2R3B'', UNC00685, AL732314.1 

SHOX 



DUP5 


ChrX: 382644-542740' 


160 Kb 


168 Kb 




PLCXDl, GTPBP6, PPP2R3B, SHOX LINC00685, AL732314.1, RPl 1~309M23.1 


DUP20 


ChrX: 11194597-11796693' 


602 Kb 


608 Kb 




MIDI, HCCS, ARHGAP6, AMELX, 
MSL3, FRMPD4, PRPS2, TLR7, 
TLR8, TMSB4X, FAM9C. 


RP11-120D5.1, AC002981.1, FRMPD4-AS1, TLR8-AS1, 
RPll'79m20.1, GS1-600G8.5 


DUP26 


ChrX: 37283466-37372045' 


89 Kb 


96 Kb 




FAM47C, PRRGf, TM4SF2\ 
LANCL3, XK, CYBB, DYNLT3, 
CXorf27, SYTL5. 


RNU6-49 



DUP40 ChrX: 80225590-80230870' 5.3 Kb 28 Kb No BRWD3, HMGN5, SH3BGRL ACA64, U6, AL357115.1 



^Genomic positions are in Hgl9. 

^Non-allelic Homologous Recombination (NAHR). 

^Short regions (<300 bp} with 80% homology were found, but were not considered sufficient for NAHR. 
^Gene crossed the minimum and maximum threshold, and is not fully duplicated. 
doi:1 0.1 371 /journal.pone.0097746.t003 



DUP40 (Table 3). Both these genes show testis-specific expression 
[14]. Concerning DUP26, three variants are reported to overlap 
with this CNV: esvl007820 (6.9 Kb), describing a gain found in a 
man without information on semen parameters; esv28598 (6.3 
Kb), describing a gain found in a woman over a total of 451 
subjects; esv2740093 (13.6 Kb), describing a loss found in one man 
over 96 subjects totally analyzed. None of these CNVs intersect 
the FAM47C directly but are located between 307.7-325.6 Kb 
downstream. Two losses (esv23127, size: 380.5 Kb; nsv510839, 
size: 74.4 Kb) and 1 gain (esv24190, size: 18.1 Kb) covered the 
entire DUP40 CNV minimum, whereas one gain (nsv436916, size: 
10.26 Kb) covered DUP40 for most of its size (3.8 Kb). Again, 
these CNVs do not intersect HMGN5 directly but are located 
between 238.3-281.1 Kb upstream. The 2 gains and the 2 losses 
were found at the same frequency in 2 women and 2 men over a 
total of 508 analyzed subjects (0.39%). No information on the 
men's fertility status was available. 

Discussion 

A recent study by Tiittelmann et al [4] provides the first 
statistically significant duplication burden on the X chromosome, 
reporting a significantly higher number of gains in azoospermic 
patients compared to both ohgozoospermic patients and normal 
controls. Our analysis of five X-linked CNVs also reveals a 



significandy higher duplication load in infertile compared to 
normozoospermic men. 

While four of the five CNVs (DUP5, DUP20, DUP26 and 
DUP40) studied did not individually reach statistical significance, 
they remained patient-specific. It is worth noting that rare variants 
have previously been predicted to play an important role in 
spermatogenic failure [5] due to the strong selection against 
highly-penetrant infertility-causing variants. All CNVs include, or 
are in close proximity to, genes with testis-specific expression and 
potential implication in spermatogenesis. DUP20 contains the 
MSL3 gene, a homologue of Dro.sophila melanogaster homonymous 
gene. Msl3 in D. melanogaster plays a critical role in the X- 
chromosome dosage-compensation pathway by directing histone 
H4 acetylation at lysine 16 (H4K16) and the human homologue is 
thought to have a similar function [17]. Although no information 
in humans is available, mice models provide evidence that this 
specific chromatin modification is dramatically increased in 
elongating spermatids and precedes histone replacement during 
spermatogenesis, as an initial step of nucleosome removal [18]. 
Both DUP26 and DUP40 are within close proximity of genes 
expressed exclusively in the testis. Inside and nearby DUP40 there 
is a dense area with epigenetic features indicating that DUP40 
may disrupt the epigenetic regulation of neighbouring genes. 
DUP5 is of particular interest due to its large size and location on 
the PARI as explained below. 



DUPIA. _ _ 

PLCXDl > SHOX s 

< GTPBP6 

< PPP2R3B 



< CRLF3 < SLC25A6 
C5F3RA > < P2RY8 

IL3RA > AKAP17A > 
< ASMTL ASMT > 



3. 00 M 

> < ARSD 
i > < ARSE 
GYG2 > ABSH > 
ARSF > 



CXort28 > 
< MXRA5 



250Kb 260Kb 270Kb 280Kb 290Kb .lOOKb 310Kb .!20Kb 330Kb 
I 100Kb 



LINC0068S > < PPP2R3B 



Figure 1. Position of DUPIA relative to LIN00685 and PPP2R3B on the X chromosome (PARI). Diagram of the Xp22.33, showing the 
presence of all known protein-coding genes (Red) and DUPIA (Blue). Enlarged is a 1 00 Kb region showing the location of PPP2R3B and LINC00685 in 
relation to the DUPIA minimum and maximum. This gain will certainly duplicate the antisense element LINC00685, but does not fully duplicate 
PPP2R3B - skewing the ration between the gene and its negative regulator. 
doi:1 0.1 371/journal.pone.0097746.g001 
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DUPIA was found at a significantly higlier fi-equency in 
patients. This gain contains a long non-coding RNA [LINC00685) 
tliat potentially acts as a negative regulator of a gene with potential 
role in spermatogenesis, PPP2R3B. This proposed mechanism 
could not be confirmed by functional studies because in vitro 
human spermatogenic cell culture is not available and this 
antisense is not present in easily accessible model organisms such 
as mouse and Drosophila. However, expression data provides 
evidence for an inverse relationship between PPP2R3B and 
LINC00685 levels in a number of different tissues (see Figure SI 
in File SI and Figure 2) [13,14]. Concerning the testis, those 
samples with a high expression of PPP2R3B show comparatively 
low expression oi LLNC00685 [13,14]. 

Figure 2 shows the changing levels of PPP2R3B and 
LINC00685 mRMA throughout spermatogenesis and the inverse 
relationship between PPP2R3B and LLNC00685. In tissues that 
lack germ cells, PPP2R3B levels are low, and L1NC()0685 levels are 
comparatively high (AdMinus, JSl andJS2). PPP2R3B levels rise 
with the presence of mitoticaUy active cells (AdPlus, JS3), and are 
highest in samples enriched in meioticaUy active cells JS5). 
Scoring system for testicular biopsies was obtained from Chalmel 
et al [14] and is reported in Table S3 in File SI. Again, in these 
tissues LINC00685 levels are decreased. Based on this observation, 
we propose that the mechanism by which DUPIA could lead to 
spermatogenic failure is through increased negative regulation, 
caused by the duplicated LLNC00685 that would decrease 
PPP2R3B transcription in the developing germ cells. This 
hypothesis is also supported by our qPCR analysis, which proved 
that the six patients with DUPIA, thus carrying an over-dosage of 
the antisense LINC00685, do not display a duplication of the entire 
PPP2R3B gene. 

Although the role of PPP2R3B in spermatogenesis has not been 
explored, indirect functional evidence supports its involvement 
botii in mitosis and meiosis. PPP2R3B encodes for a subunit of the 
protein phosphatase 2 (PP2) protein complex. PP2 is one of four 
major Ser/Thr phosphatases and has been implicated in a wide 
range of cellular activities. PP2 is a heterotrimeric protein, 
composed of a structural A-subunit, a catalytic C-subunit and a 
regulatory B-subunit. PPP2R3B belongs to this final group, giving 
specificity to the PP2 complex [19]. In mice, PPP2R35-PP2 was 
demonstrated to maintain a pool of dephosphorylated CDC6 - a 
replication-licensing factor [19]. CDC6 phosphorylation and 
dephosphorylation is necessary for the mitotic Gi to S-phase 
transition [20] . Accordingly, overexpression of PPP2R3B results in 
mitotic arrest at Gi phase [19]. Unlike somatic cells, following 
mitosis primary spermatocytes retain high levels of CDC6 [20], 
indicating a possible further role for this protein in meiosis. 



Expression analysis of testis biopsies with different histology 
patterns supports the involvement of PPP2R3B in mitosis where 
PPP2R3B levels are higher in samples containing mitoticaUy active 
cells compared to those lacking germ cells. More importantiy, the 
most pronounced expression is observed in biopsies with 
spermatocytic arrest i.e. those enriched in meiotic cells, indicating 
an additional role for this gene in meiosis [14]. As mentioned 
above, LLNC00685 levels show an opposite trend, strongly 
suggesting a functional link between the two gene products 
[13,14] (see supplementary discussion S2, Table S3 and Figure SI 
in File SI). 

Given that genes that have been recently incorporated into the 
human X chromosome are involved in male reproductive fitness 
[12] it is worth noting that PPP2R3B has been recently acquired 
on this chromosome. 

Although no CNVs were found in the database of genomic 
variants (DGV) intersecting PPP2R3B or LINC00685, re-analysis 
of raw data deposited in dbVar by Tilttelmann et al. [4] shows that 
a number of CNVs (5 gains and 2 losses) were found in the PAR 1 
of azoospermic men. Focusing on the gains, the variant nsv869733 
(243.8 Kb), describing 3 gains mapping to the PPP2R3B locus, was 
found exclusively in azoospermic patients (n = 4). The CNV 
minimums of two of these gains (one mapping to chrX: 298, 292— 
322, 672 and the other mapping to chrX: 298, 292-330, 801) 
begin in PPP2R3B intron 1—2 and end in intron 12-13, 
respectively. These CNVs are likely to lead to a non-functional 
protein by causing an internal duplication. The other CNV 
(mapping to chrX: 291, 285-336, 040) begins 3.4 Kb upstream of 
the PPP2R3B locus and ends in intron 12-13, which would disrupt 
the gene at exon 1 as well as the promoter. Moreover, the presence 
of a predicted promoter 1 Kb upstream of PPP2R3B and several 
regions enriched in histone methylation suggests that these gains 
may disrupt the correct regulation oi PPP2R3B. The duplication of 
the PPP2R3B locus has been reported also in men with abnormal 
karyotype and AZF deletions and in a single man with normal 
karyotype and AZFb deletion [21]. However, in these cases the 
infertile phenotype is clearly related to the karyotype and Y 
chromosome defects. The observations from Tilttelmann, together 
with ours, showing the relatively high frequency at which DUPIA 
was found exclusively in patients and its potential link to PPP2R3B 
gene expression, strongly indicate that PPP2R3B could be 
considered a novel spermatogenesis candidate gene. 

Large CNVs of the PARI region, like DUPIA and DUP5, may 
lead to impaired spermatogenesis also through a structural effect 
disturbing male meiosis i.e. altering the recombination event that 
occurs between the PARI regions of the sex chromosomes [8,9]. 
PARI recombination becomes progressively more frequent 




PPP2R3B:219264_s_at ■ LINC0068S:232463_at 



Figure 2. Relative expression of PPP2R3B and UNC00685 in testes biopsies in patients with different phenotypes (Johnsen Score) 
(Chalmel, et al., 2012). Taken from EBI Expression Atlas, (http://www.ebi.ac.uk/gxa/home). 
doi:1 0.1 371/journal.pone.0097746.g002 
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towards the distal telomeric boundary [22], where DUPIA and 
DUP5 are located, showing the importance of this region during 
meiosis. For instance, the merged analysis of data by Tiittelmann 
and our previous study shows that no PAR-linked CNVs were 
found in the vicinity of these 2 CNVs in normozoospermic 
controls, with the exception of one subject found by Krausz et al. 
to be carrying DUP4A. However, this CNV only partially overlaps 
with the DUPIA (at its distal boundary) and DUP5 (at its proximal 
boundary). This observation may indicate that: i) the pathogenic 
effect of DUPIA is more likely related to the misbalanced 
LINC00685/PPP2R3B gen(; dosage effect; ii) the portions of 
DUPIA and DUP5 that are not overlapping with DUP4A could 
be affc-cting specific sites of importance for X-Y pairing during 
meiosis. 

Finally, the combination of all data discussed above supports the 
importance of the PARI -linked CNVs in male infertility. Our 
most relevant finding is the identification of the first recurrent, X- 
linked gain associated with spermatogenic failure, DUPIA. Two 
possible mechanisms have been provided to explain the patho- 
genesis of the associated infertile phenotype - one identifying and 
regarding a novel spermatogenesis candidate gene and another 
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